由于其在自主驾驶中的应用,因此基于单眼图像的3D感知已成为一个活跃的研究领域。与基于激光雷达的技术相比,单眼3D感知(包括检测和跟踪)的方法通常会产生较低的性能。通过系统的分析,我们确定了每个对象深度估计精度是界限性能的主要因素。在这种观察过程中,我们提出了一种多级融合方法,该方法将不同的表示(RGB和伪LIDAR)和跨多个对象(Tracklets)的时间信息结合在一起,以增强对目标深度估计。我们提出的融合方法实现了Waymo打开数据集,KITTI检测数据集和Kitti MOT数据集的每个对象深度估计的最新性能。我们进一步证明,通过简单地用融合增强的深度替换估计的深度,我们可以在单眼3D感知任务(包括检测和跟踪)方面取得重大改进。
translated by 谷歌翻译
We present a new algorithm for automatically bounding the Taylor remainder series. In the special case of a scalar function $f: \mathbb{R} \mapsto \mathbb{R}$, our algorithm takes as input a reference point $x_0$, trust region $[a, b]$, and integer $k \ge 0$, and returns an interval $I$ such that $f(x) - \sum_{i=0}^k \frac {f^{(i)}(x_0)} {i!} (x - x_0)^i \in I (x - x_0)^{k+1}$ for all $x \in [a, b]$. As in automatic differentiation, the function $f$ is provided to the algorithm in symbolic form, and must be composed of known elementary functions. At a high level, our algorithm has two steps. First, for a variety of commonly-used elementary functions (e.g., $\exp$, $\log$), we derive sharp polynomial upper and lower bounds on the Taylor remainder series. We then recursively combine the bounds for the elementary functions using an interval arithmetic variant of Taylor-mode automatic differentiation. Our algorithm can make efficient use of machine learning hardware accelerators, and we provide an open source implementation in JAX. We then turn our attention to applications. Most notably, we use our new machinery to create the first universal majorization-minimization optimization algorithms: algorithms that iteratively minimize an arbitrary loss using a majorizer that is derived automatically, rather than by hand. Applied to machine learning, this leads to architecture-specific optimizers for training deep networks that converge from any starting point, without hyperparameter tuning. Our experiments show that for some optimization problems, these hyperparameter-free optimizers outperform tuned versions of gradient descent, Adam, and AdaGrad. We also show that our automatically-derived bounds can be used for verified global optimization and numerical integration, and to prove sharper versions of Jensen's inequality.
translated by 谷歌翻译
Airbnb is a two-sided marketplace, bringing together hosts who own listings for rent, with prospective guests from around the globe. Applying neural network-based learning to rank techniques has led to significant improvements in matching guests with hosts. These improvements in ranking were driven by a core strategy: order the listings by their estimated booking probabilities, then iterate on techniques to make these booking probability estimates more and more accurate. Embedded implicitly in this strategy was an assumption that the booking probability of a listing could be determined independently of other listings in search results. In this paper we discuss how this assumption, pervasive throughout the commonly-used learning to rank frameworks, is false. We provide a theoretical foundation correcting this assumption, followed by efficient neural network architectures based on the theory. Explicitly accounting for possible similarities between listings, and reducing them to diversify the search results generated strong positive impact. We discuss these metric wins as part of the online A/B tests of the theory. Our method provides a practical way to diversify search results for large-scale production ranking systems.
translated by 谷歌翻译
公开演讲期间的压力很普遍,会对绩效和自信产生不利影响。已经进行了广泛的研究以开发各种模型以识别情绪状态。但是,已经进行了最少的研究,以实时使用语音分析来检测公众演讲期间的压力。在这种情况下,当前的审查表明,算法的应用未正确探索,并有助于确定创建合适的测试环境的主要障碍,同时考虑当前的复杂性和局限性。在本文中,我们介绍了我们的主要思想,并提出了一个应力检测计算算法模型,该模型可以集成到虚拟现实(VR)应用程序中,以创建一个智能的虚拟受众,以提高公开讲话技能。当与VR集成时,开发的模型将能够通过分析与指示压力的生理参数相关的语音功能来实时检测过度压力,并帮助用户逐渐控制过度的压力并改善公众演讲表现
translated by 谷歌翻译
普通的卷积神经网络(CNNS)已被用于在过去几年中的各个域中实现最先进的性能,包括通过眼睛运动的生物识别认证。普通CNNS已经有许多相对较近的改进,包括残差网络(RESNET)和密集连接的卷积网络(DENSENET)。虽然这些网络主要是目标图像处理域,但它们可以很容易地修改以使用时间序列数据。我们采用DENSenet架构,通过眼睛运动来实现端到端的生物认证。我们将我们的模型与最相关的现有作品进行比较,包括当前最先进的工作。我们发现我们的模型实现了所有考虑的培训条件和数据集的最先进的性能。
translated by 谷歌翻译
无监督的异常检测对于未来在大型数据集中搜索稀有现象的分析可能至关重要,例如在LHC收集的。为此,我们介绍了一个受到物理启发的变量自动编码器(VAE)体系结构,该体系结构在LHC奥运会机器学习挑战数据集中竞争性和稳健性。我们证明了如何将某些物理可观察物直接嵌入VAE潜在空间中,同时使分类器显然是不可知的,可以帮助识别和表征测得的光谱中的特征,这是由于数据集中存在异常而引起的。
translated by 谷歌翻译
为了实现对日常生活的人类常识,机器学习系统必须理解和理解环境中其他代理人的目标,偏好和行动。在他们的第一年的生命结束时,人类婴儿直观地实现了如此常识,这些认知成就为人类丰富而复杂地了解他人的心理状态。Can Machines可以实现更广泛的,致辞推理对人类婴儿这样的其他药剂吗?婴儿直觉的基准(围兜)挑战机器,以预测代理人行为的合理性,基于其行动的基本原因。由于BIB的内容和范式从发育认知科学中采用,因此BIB允许在人类和机器性能之间直接比较。尽管如此,最近提出的深度学习的机构推理模型未能表现出婴儿的推理,让围兜成为一个开放的挑战。
translated by 谷歌翻译
Discriminative neural networks offer little or no performance guarantees when deployed on data not generated by the same process as the training distribution. On such out-of-distribution (OOD) inputs, the prediction may not only be erroneous, but confidently so, limiting the safe deployment of classifiers in real-world applications. One such challenging application is bacteria identification based on genomic sequences, which holds the promise of early detection of diseases, but requires a model that can output low confidence predictions on OOD genomic sequences from new bacteria that were not present in the training data. We introduce a genomics dataset for OOD detection that allows other researchers to benchmark progress on this important problem. We investigate deep generative model based approaches for OOD detection and observe that the likelihood score is heavily affected by population level background statistics. We propose a likelihood ratio method for deep generative models which effectively corrects for these confounding background statistics. We benchmark the OOD detection performance of the proposed method against existing approaches on the genomics dataset and show that our method achieves state-of-the-art performance. We demonstrate the generality of the proposed method by showing that it significantly improves OOD detection when applied to deep generative models of images.
translated by 谷歌翻译
Modern machine learning methods including deep learning have achieved great success in predictive accuracy for supervised learning tasks, but may still fall short in giving useful estimates of their predictive uncertainty. Quantifying uncertainty is especially critical in real-world settings, which often involve input distributions that are shifted from the training distribution due to a variety of factors including sample bias and non-stationarity. In such settings, well calibrated uncertainty estimates convey information about when a model's output should (or should not) be trusted. Many probabilistic deep learning methods, including Bayesian-and non-Bayesian methods, have been proposed in the literature for quantifying predictive uncertainty, but to our knowledge there has not previously been a rigorous largescale empirical comparison of these methods under dataset shift. We present a largescale benchmark of existing state-of-the-art methods on classification problems and investigate the effect of dataset shift on accuracy and calibration. We find that traditional post-hoc calibration does indeed fall short, as do several other previous methods. However, some methods that marginalize over models give surprisingly strong results across a broad spectrum of tasks.
translated by 谷歌翻译
We present a variational approximation to the information bottleneck of Tishby et al. (1999). This variational approach allows us to parameterize the information bottleneck model using a neural network and leverage the reparameterization trick for efficient training. We call this method "Deep Variational Information Bottleneck", or Deep VIB. We show that models trained with the VIB objective outperform those that are trained with other forms of regularization, in terms of generalization performance and robustness to adversarial attack.
translated by 谷歌翻译